Grace and McLean (2006) proposed a decision model for acquisition of choice in concurrent chains 
which assumes that after reinforcement in a terminal link, subjects make a discrimination whether the 
preceding reinforcer delay was short or long relative to a criterion. Their model was subsequently 
extended by Christensen and Grace (2008, 2009a, 2009b) to include effects of initial- and terminal-link 
duration on choice. We show that an expression for steady-state responding can be derived from the 
decision model, which enables a model for choice that provides an account of archival data that is equal 
or superior to the contextual choice model (Grace, 1994) and hyperbolic value-added model (Mazur, 
2001) in terms of goodness of fit, parsimony, and parameter invariance. The success of the steady-state 
decision model validates the strategy of understanding acquisition phenomena as a bridge toward 
explaining choice at the molar level. 

The concurrent-chains procedure is commonly
used to study choice between reinforcement
outcomes signaled by distinctive stimuli. 
In a typical version of this procedure, pigeons 
peck at two lighted response keys during the 
choice phase or initial links. Concurrent 
variable-interval (VI) schedules operate during 
the initial links, and provide access to one of 
two mutually exclusive terminal-link schedules, 
which are usually signaled by distinctive 
stimuli. Responding during the terminal links 
produces access to food, after which the initial 
links are reinstated. 

Most prior research has used steady-state 
designs in which the same pair of terminal-link 
schedules is maintained until responding has 
stabilized. The terminal-link schedules are 
varied across conditions. Response allocation 
during the initial links is interpreted as a 
measure of the relative value of the terminal-link
stimuli as conditioned reinforcers, and 
the primary challenge has been to describe 
how response allocation depends on the initial 
and terminal-link schedules. Various models 
for concurrent chains have been proposed, 
including delay reduction theory (DRT; Fan- 
tino, 1969), the contextual choice model 
(CCM; Grace, 1994) and the hyperbolic 
value-added model (Mazur, 2001), and have 
been shown to account for a substantial 
percentage of variance in response allocation. 

Although differing in specific details, these 
models are related to the matching law and 
share the assumption that choice in the initial 
links depends on the relative value of the 
terminal-link stimuli. However, there are rea- 
sons to question whether this assumption — 
which Grace (2002) termed the value hypothe- 
sis — can be sustained. For example, Grace and 
Nevin (1999) used a procedure in which the 
terminal links included no-food trials similar 
to the peak procedure (Roberts, 1981) so that 
temporal control of responding in the termi- 
nal links could be studied. In their study, 
pigeons were trained with fixed-interval (FI) 
40-s and 20-s terminal links. After 25 sessions, 
initial-link response allocation strongly favored 
the alternative leading to the FI 20 s, and the 
location of peak responding on no-food trials 
was approximately equal to the schedule 
duration. Next, the pigeons received 25 
sessions of training in which only the termi- 
nal-link stimuli were presented and the FI 40 s 
was changed to FI 10 s. The initial links were 
replaced by an intertrial interval during which 
the keys were dark, so that technically the 
procedure was a multiple peak procedure. The 
location of peak responding adapted rapidly to 
the new schedule values, and was maintained 
over the course of the 25 sessions. The pigeons 
were then returned to concurrent chains. The 
key result was that although responding on no- 
food trials showed that pigeons continued to 
time the 20-s and 10-s delays accurately, initial- 
link choice favored the alternative that led to 
the FI 20-s schedule and required many 
sessions to switch preference. Thus, there was 
a dissociation between choice and timing such 
that pigeons responded more in the initial 
links for the alternative that they “knew”, 
based on their terminal-link responding, was 
associated with the longer delay. These data 
are difficult to reconcile with the view that 
initial-link responding reflects the relative 
value of the terminal-link stimuli, or results 
from sampling of memory distributions asso- 
ciated with each alternative (Gibbon, Church, 
Fairhurst & Kacelnik, 1988; Gallistel & Gibbon, 
2000). 

The challenge that results like Grace and 
Nevin’s (1999) pose for traditional models for 
choice based on matching and conditioned 
reinforcement suggests that rather than at- 
tempting to understand choice from a “top 
down” perspective, that is, by developing a 
quantitative model that can describe steady- 
state allocation at the molar level, it might be 
worthwhile to try a “bottom up” approach. 
Specifically, studying how response allocation 
changes when the terminal-link schedules are 
altered — that is, the dynamics of choice — may 
lead to a model that not only accounts for 
changes in response allocation, but makes 
accurate steady-state predictions as well. In the 
present article, we show that the decision 
model proposed by Grace and McLean 
(2006) and Christensen and Grace (2008) to 
account for acquisition phenomena leads to 
an expression for the effects of terminal-link 
schedules on steady-state choice. In brief, the 
model assumes that when food is obtained in a 
terminal link, subjects make a discrimina- 
tion — a decision — about whether the delay 
from terminal-link onset to food was relatively 
short or long. The tendency to respond to the 
corresponding initial link increases or decreas- 
es if the delay is judged short or long, 
respectively. Response allocation at steady state 
reflects the cumulative effect of these discrim- 
inations. The expression for the effects of 
terminal-link schedules plays the same role as 
conditioned reinforcement in the models of 
Grace (1994) and Mazur (2001), leading to a 
model that can describe the archival data with 
a comparable degree of accuracy. We first 
review the background studies for the decision 
model proposed by Grace and McLean and 
extended by Christensen and Grace (2008, 
2009a, 2009b). We then note an additional 
assumption for the model to provide a realistic 
account of choice, and derive an expression 
for the effects of terminal-link schedules on 
steady-state responding. Finally, we apply the 
resulting model to the same archival data sets 
used by Grace (1994) and Mazur (2001), and compare 
performance of the various models. 

Acquisition of Choice in Concurrent Chains 

Hunter and Davison (1985) pioneered the 
use of a pseudorandom binary series (PRBS) 
to study the acquisition of choice behavior. In 
their experiment, the alternative associated 
with the richer reinforcement rate in concur- 
rent VI VI schedules changed unpredictably 
across sessions according to a PRBS. The PRBS 
ensured that whichever alternative was richer 
for a given session was random, and could not 
be predicted on the basis of prior sessions. 
Hunter and Davison showed that pigeons’ 
response allocation adjusted rapidly to changes
in the reinforcer ratio. Subsequently, Schofield
and Davison (1997) used a lagged multiple
regression analysis to show that, after
sufficient training with the PRBS procedure,
response allocation in a given session depended
on the relative reinforcement rate in that
session, with little evidence of control by
prior sessions (see also Davison & McCarthy,
1988). 

Grace, Bragason and McLean (2003) inves- 
tigated whether pigeons’ response allocation 
in concurrent chains could track unpredict- 
able changes in terminal-link reinforcement 
delays across sessions using a similar PRBS 
design. In their Experiment 1, the left termi- 
nal link was always FI 8 s and the right terminal 
link was either FI 4 s or FI 16 s, as determined 
by a 31-step PRBS across sessions. Multiple 
regression analyses confirmed that after two 
exposures to the series (62 sessions), initial- 
link response allocation depended on the 
terminal-link delays arranged in the current 
session, with negligible influence from 
previous sessions. Results showed that sensitiv- 
ity to reinforcer immediacy increased through 
the first half of the session, and remained 
approximately constant thereafter. 

In Experiment 2, Grace et al. (2003) tested 
whether arranging a unique FI schedule value 
for the right alternative in each session, 
rather than selecting either FI 4 s or FI 16 s, 
would disrupt the acquisition of choice. The 
pigeons were the same as those from Exper- 
iment 1, and training began immediately 
after that experiment was completed. Sur- 
prisingly, Grace et al. found that sensitivity to 
reinforcer immediacy was approximately the 
same as the level reached in Experiment 1 
and did not change systematically over the 
course of training (60 sessions). They con- 
cluded that whatever response strategy the 
pigeons had learned in Experiment 1 was not 
disrupted by the use of different delays for 
the right terminal link in each session in 
Experiment 2. 

Grace and McLean (2006) noted that these 
results — particularly the lack of disruption in 
Experiment 2 — were potentially problematic 
for models of choice based on conditioned 
reinforcement. If response allocation depend- 
ed on the learned value of the terminal-link 
stimuli, then it should have been easier for 
choice to adjust in Experiment 1, where the 
right terminal link changed between two 
values, than in Experiment 2, where the right 
terminal link was sampled from a potentially 
infinite population of values. Thus, Grace and 
McLean conducted an experiment to compare 
response allocation in two conditions: A 
minimum-variation condition which was identi- 
cal to that used by Grace et al. (2003) in 
Experiment 1; and a maximum-variation condi- 
tion in which a different terminal-link FI value 
was arranged for both alternatives in every 
session, with the location of the shorter FI 
determined by a 31-step PRBS. They reasoned 
that if choice depended on the learned value 
of the terminal-link stimuli, then pigeons 
should show greater sensitivity to immediacy 
in the minimum-variation condition. However, 
Grace and McLean found that there was no 
systematic difference in sensitivity between 
the minimum- and maximum-variation conditions. 

Analysis of data from the maximum-varia- 
tion condition showed two distinct patterns of 
results: In some cases, scatterplots of log 
initial-link response allocation as a function 
of the terminal-link log immediacy ratio were 
approximately linear, consistent with general- 
ized matching. However, in other cases the 
scatterplots showed a nonlinear pattern, in 
which response allocation fell into one of two 
clusters depending on whether the left or right 
alternative was favored. Overall, the relation- 
ship appeared to be sigmoidal, with a greater 
difference between the clusters than within 
them (see their Figure 4). Grace and McLean 
(2006) suggested that a process akin to 
categorical discrimination may have influ- 
enced responding — that is, the pigeons 
learned to respond more in each session to 
whichever alternative led to the shorter termi- 
nal link delay, but how much that delay was 
shorter than the alternative had little influence 
over choice. 

Grace and McLean (2006) proposed a 
model which could account for the different 
patterns of results in the maximum-variation 
condition. They assumed that after reinforce- 
ment in a terminal link, pigeons made a 
decision about whether the preceding rein- 
forcement delay was short or long relative to a 
criterion. If the delay is judged short, response 
strength for the associated initial link increas- 
es, whereas if the delay is judged long, then 
response strength decreases. Response alloca- 
tion is then predicted by the ratio of the 
response strengths. Changes in response 
strength are made according to a linear- 
operator rule (with parallel equations for left 
and right alternatives): 

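\[
\Delta r_{n+1} =
\begin{cases}
\alpha (r_{\max} - r_n) & \text{with probability } p, \\
-\alpha (r_n - r_{\min}) & \text{with probability } 1 - p,
\end{cases}
\]

so that

\[
\Delta r_{n+1} = p\,\alpha (r_{\max} - r_n) + (1 - p)(-\alpha)(r_n - r_{\min}). \tag{1}
\]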

According to Equation 1, $\Delta r_{n+1}$ (the change in
expected response strength for cycle $n+1$) is
a function of response strength on the
previous cycle ($r_n$) and an additive or subtractive
term, depending on whether the delay on
cycle $n$ was judged as short or long, respectively.
If the previous delay was judged as short
(with probability $p$), response strength
increases by a proportion ($\alpha$) of the difference
between the maximum response strength
($r_{\max}$) and current response strength, whereas
if the delay was judged as long (with probability
$1 - p$), response strength decreases by a
proportion of the difference between the
current and minimum response strength
($r_{\min}$). 
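
The linear-operator rule can be illustrated with a minimal Python sketch. The function names and the learning-rate value used in the usage note are hypothetical choices made for illustration; the bounds of 1 and .01 for maximum and minimum response strength are the values the article reports using in its analyses.

```python
import random

R_MAX = 1.0   # maximum response strength (value used in the article)
R_MIN = 0.01  # minimum response strength (value used in the article)


def expected_change(r, p, alpha):
    """Expected change in response strength on the next cycle (Equation 1)."""
    return p * alpha * (R_MAX - r) + (1 - p) * (-alpha) * (r - R_MIN)


def stochastic_update(r, p, alpha, rng):
    """One cycle: the delay is judged 'short' with probability p, else 'long'."""
    if rng.random() < p:
        return r + alpha * (R_MAX - r)   # judged short: strength increases
    return r - alpha * (r - R_MIN)       # judged long: strength decreases


def iterate_expected(r0, p, alpha, n_cycles):
    """Apply the expected update repeatedly, starting from strength r0."""
    r = r0
    for _ in range(n_cycles):
        r += expected_change(r, p, alpha)
    return r
```

For example, `iterate_expected(0.5, 0.8, 0.1, 500)` settles at the weighted average of the two bounds, `0.8 * R_MAX + 0.2 * R_MIN`, whatever the starting strength.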

The model assumes that all delays are scaled
logarithmically, and computes the probability
of a “short” decision as the area under a
normal distribution to the right of the previous
delay, log D. The mean of the distribution is the
average of the log delays across both alternatives,
and is referred to as the criterion (log C).
The standard deviation ($\sigma$) is a parameter in
the model, which determines the accuracy with
which delays are judged as short or long.
Specifically, the probability, p, that a delay, log
D, is judged short is $1 - \Phi(\log D, \log C, \sigma)$,
where $\Phi$ is the cumulative normal distribution
with mean = log C and standard deviation = $\sigma$,
evaluated at log D. Grace and McLean (2006)
showed that when $\sigma$ was relatively high,
classification decisions were less accurate and
response allocation was approximately a linear
function of the log immediacy ratio (i.e.,
generalized matching). However, when $\sigma$ was
relatively small, decisions were more accurate
and response allocation was a nonlinear (sigmoidal)
function of the log immediacy ratio.
Grace and McLean fitted the model to the
results from individual subjects and showed
that it accounted for the major features of the
data. 
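
As a rough illustration of how the “short” probability behaves, the following Python sketch computes it from the cumulative normal distribution. The function names are hypothetical, and base-10 logarithms are an assumption made here (the text does not state the base); the FI 4 s and FI 8 s delays are taken from the experiments described above.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative normal distribution evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def p_short(log_d, log_c, sigma):
    """Probability that a delay (log D) is judged short relative to log C."""
    return 1.0 - normal_cdf(log_d, log_c, sigma)


# Illustration with FI 4 s versus FI 8 s terminal links (base-10 logs assumed).
log_short, log_long = math.log10(4.0), math.log10(8.0)
log_c = (log_short + log_long) / 2.0  # criterion: average of the log delays

for sigma in (0.05, 0.3, 1.0):
    print(sigma, p_short(log_short, log_c, sigma), p_short(log_long, log_c, sigma))
```

With a small standard deviation the two probabilities approach 1 and 0 (nearly categorical decisions), whereas with a large standard deviation both drift toward .5, which is the contrast between the sigmoidal and approximately linear patterns described above.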

Christensen and Grace (2008) noted that a 
major limitation of the decision model was 
that it was unable to account for effects of 
overall initial- and terminal-link duration, 
which are well established in the literature 
(Berg & Grace, 2006; Grace & Bragason, 
2004). For example, Fantino (1969) showed 
that preference between a constant pair of 
terminal links became less extreme when the 
initial-link schedules increased. An effect of 
overall terminal-link duration on preference 
was first reported by MacEwen (1972), who 
found that preference for the shorter of two 
terminal-link schedules in a constant ratio 
increased as their overall duration increased. 
The initial- and terminal-link effects were 
influential in the development of delay-reduc- 
tion theory (see Fantino, Preston & Dunn, 
1993; Fantino & Romanowich, 2007, for
reviews) and are predicted by other steady-state
models for choice (Grace, 1994, 2004;
Mazur, 2001). 

Christensen and Grace (2008) proposed 
that the criterion in the decision model could 
be calculated as the average of all interstimulus 
intervals correlated with reinforcement. Thus 
both intervals between initial-link onset and 
terminal-link entry, as well as between termi- 
nal-link onset and reinforcement, were includ- 
ed. The rationale was that when making 
decisions about whether delays were short or 
long, pigeons did not clearly discriminate 
between initial- and terminal-link intervals. 
They showed that when the criterion was 
computed in this way, the decision model 
predicted the initial-link effect. As initial-link 
duration increased, the criterion increased 
and thus p increased for both terminal links. 
However, they showed that p increased more 
slowly for the shorter schedule, producing an 
attenuated preference. Moreover, the model 
also predicted that preference would decrease 
for very short initial-link durations, which was 
confirmed in an experiment. 

Christensen and Grace (2009a) showed that 
the decision model also predicted the termi- 
nal-link effect when both initial- and terminal- 
link delays contributed to the criterion. They 
showed that when terminal-link duration was 
increased, the criterion also increased but less 
than proportionally. Consequently the predict- 
ed preference (which is determined by the 
ratio of p for the left and right terminal links) 
increased. They reported an experiment using
a rapid-acquisition design that confirmed both
the increased sensitivity to reinforcer immediacy
predicted by the terminal-link effect and
the less-than-proportional increase in the
criterion. 

Christensen and Grace (2009b) made two 
further additions to the decision model. They 
included a linear-operator term to account for 
changes in response strength across sessions, 
and proposed an exponentially weighted 
moving average (EWMA; Killeen, 1981) for 
updating the criterion: 

\[
\log C_{N+1} = \beta \log D_N + (1 - \beta) \log C_N. \tag{2}
\]

The criterion is assumed to be updated after
every transition between stimuli (i.e., from initial link
to terminal link, and from terminal link to reinforcement;
that is, at terminal-link entry and after
reinforcement). In Equation 2,
$\log C_N$ and $\log C_{N+1}$ are the criterion values after
stimulus transitions $N$ and $N+1$, respectively, $\log D_N$
is the $N$th stimulus-transition interval, and $\beta$
indicates how much weight is given to the most
recent interval. 
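
The criterion update can be sketched in a few lines of Python. The names `update_criterion`, `run_criterion`, and `weight` (standing for the weighting parameter in Equation 2) are hypothetical, and base-10 logarithms are an assumption made for this illustration.

```python
import math


def update_criterion(log_c, interval, weight):
    """Exponentially weighted moving average of log intervals (Equation 2)."""
    return weight * math.log10(interval) + (1.0 - weight) * log_c


def run_criterion(intervals, weight, log_c0=0.0):
    """Update the criterion after each stimulus-transition interval in turn."""
    log_c = log_c0
    for interval in intervals:
        log_c = update_criterion(log_c, interval, weight)
    return log_c


# Alternating initial-link (about 10 s) and terminal-link (20 s) intervals;
# the values are illustrative only.
print(run_criterion([10.0, 20.0] * 50, weight=0.1))
```

A larger weight makes the criterion track the most recent interval more closely; a weight of 1 sets the criterion equal to the log of the last interval.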

Here we propose one further change in the 
decision model. Because all of the previous 
studies which have tested the decision model 
have used FI terminal links, there was no 
variability in the delay to reinforcement associ- 
ated with a given terminal link in each session. 
However, when VI schedules are used, it may be 
more difficult for subjects to discriminate 
whether the delay just experienced was short 
or long. As response strength is updated after 
reinforcement delivery (according to Equation 
1), the decision must be made retrospectively, 
and when delays for a particular alternative are 
variable within session, subjects’ memory for 
the just-experienced delay may be influenced to 
some extent by previous delays in the session. 
Thus we will assume that the subjects’ memory
for the just-experienced delay can be calculated as an
EWMA of the history of delays for that
alternative. Separate EWMAs are calculated
for each of the terminal links. For simplicity,
we will also assume that all VI schedules use
exponentially distributed intervals (Fleshler &
Hoffman, 1962). 

This addition to the model has two major
consequences. First, it allows the model to
predict preference for variability, that is, for VI
x over FI x. The reason is that although the
arithmetic mean delay may be equal for VI x
and FI x, the average of the log delays (i.e., the
log of the geometric mean) will be lower for VI x.
Second, because the geometric mean of a VI
distribution is less than that of an FI distribution
(which equals the arithmetic mean), the
model will predict less extreme preference
when the terminal links are both VI schedules
(e.g., VI x VI y) compared to corresponding
FI schedules (FI x FI y). Thus the model
predicts more extreme preference with FI FI
terminal links for the same reason that it
predicts the terminal-link effect: When the
overall duration of the terminal links increases,
the criterion increases but less than
proportionally. 
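
The difference between arithmetic and geometric means for a VI schedule can be checked numerically. The Python sketch below assumes the commonly cited form of the Fleshler and Hoffman (1962) progression; the formula as written here, the function names, and the 10-s example are assumptions for illustration, not details taken from the article.

```python
import math


def fleshler_hoffman(mean, n=12):
    """Intervals of an n-step constant-probability VI progression.

    Assumed form: t_k = mean * [1 + ln n + (n-k) ln(n-k) - (n-k+1) ln(n-k+1)],
    with 0 * ln 0 taken as 0.
    """
    intervals = []
    for k in range(1, n + 1):
        a = n - k
        term_a = a * math.log(a) if a > 0 else 0.0
        term_b = (a + 1) * math.log(a + 1)
        intervals.append(mean * (1.0 + math.log(n) + term_a - term_b))
    return intervals


def geometric_mean(values):
    """Geometric mean, computed as the antilog of the mean log value."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


vi_10 = fleshler_hoffman(10.0)
print(sum(vi_10) / len(vi_10))   # arithmetic mean of the VI 10-s intervals
print(geometric_mean(vi_10))     # geometric mean of the same intervals
```

The arithmetic mean of the generated intervals equals the nominal schedule value, while the geometric mean is lower, which is why the delay for VI x is treated as shorter than that for FI x once delays are scaled logarithmically.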

Steady-State Decision Model 

We can now derive the steady-state predictions
for the decision model. Given sustained
exposure to the same terminal links, Equation
1 predicts that response strength for each
initial link will reach an asymptotic value. The
asymptote may be obtained by setting $\alpha = 1$ in
Equation 1 (equivalently, by setting the expected
change $\Delta r_{n+1}$ to zero and solving for $r_n$)
and simplifying the resulting expression to yield:

\[
r_{\infty} = p\, r_{\max} + (1 - p)\, r_{\min}. \tag{3}
\]

Equation 3 states that the asymptotic response
strength is a weighted average of the maximum
and minimum response strengths ($r_{\max}$
and $r_{\min}$), depending on the probability that a
delay associated with the corresponding terminal
link is judged short ($p$). In all subsequent
analyses, $r_{\max}$ and $r_{\min}$ were set equal to
1 and .01, respectively, similar to our previous
applications of the decision model (Christensen
& Grace, 2008, 2009a, 2009b). Predicted
response allocation is then given by the ratio
of the asymptotic response strengths: 

\[
\frac{B_L}{B_R} = \frac{r_{\infty L}}{r_{\infty R}}
= \frac{p_L\, r_{\max} + (1 - p_L)\, r_{\min}}{p_R\, r_{\max} + (1 - p_R)\, r_{\min}}, \tag{4}
\]

where L and R represent calculations for the
left and right alternatives, respectively. Equation
4 is an expression for the effects of
sustained training with a given pair of terminal-link
schedules on initial-link choice. The
probability that a delay (log D) is judged short
relative to the criterion (log C) is computed as
the probability that a random sample from a
normal distribution with mean equal to log C
and standard deviation $\sigma$ is greater than log D:

\[
p = 1 - \Phi(\log D, \log C, \sigma), \tag{5}
\]

where $\Phi$ is the cumulative normal distribution
evaluated at log D. 
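
Equations 3 to 5 can be combined into a short Python sketch of the steady-state prediction. The function names are hypothetical, base-10 logarithms are assumed, and for simplicity the example criterion is the average of the two terminal-link log delays only (the article's criterion also includes initial-link intervals).

```python
import math

R_MAX = 1.0   # maximum response strength (value used in the article)
R_MIN = 0.01  # minimum response strength (value used in the article)


def p_short(log_d, log_c, sigma):
    """Equation 5: probability that log D is judged short relative to log C."""
    phi = 0.5 * (1.0 + math.erf((log_d - log_c) / (sigma * math.sqrt(2.0))))
    return 1.0 - phi


def asymptotic_strength(p):
    """Equation 3: weighted average of maximum and minimum response strength."""
    return p * R_MAX + (1.0 - p) * R_MIN


def response_ratio(log_d_left, log_d_right, log_c, sigma):
    """Equation 4: predicted left/right ratio of asymptotic response strengths."""
    r_left = asymptotic_strength(p_short(log_d_left, log_c, sigma))
    r_right = asymptotic_strength(p_short(log_d_right, log_c, sigma))
    return r_left / r_right


# Illustration: FI 10 s (left) versus FI 20 s (right), simplified criterion.
log_l, log_r = math.log10(10.0), math.log10(20.0)
log_c = (log_l + log_r) / 2.0
print(response_ratio(log_l, log_r, log_c, sigma=0.3))
```

Because response strength is bounded by 1 and .01, the predicted ratio can never exceed 100:1 in either direction, however accurate the short/long decisions are.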

For each terminal link, log D is calculated as
the log of the geometric mean reinforcement
delay. For FI schedules, log D is the log of the
schedule value. For VI schedules, the intervals
were randomized from a set of 12 intervals
based on an exponential progression (Fleshler
& Hoffman, 1962); log D was the log geometric
mean of this distribution. For log C, we assumed
that the distribution of times spent in the initial
link could be approximated by a 12-interval
exponential progression with a mean equal to
the average time spent in the initial link. Log C
was then calculated as
$(\log D_I + \log D_I + \log D_L + \log D_R)/4$,
where $\log D_I$ is the log geometric
mean of the initial-link intervals, and $\log D_L$ and
$\log D_R$ are the values for the left and right
terminal links. 

Archival Data Analyses 

Next we compare the ability of the decision 
model to account for results from steady-state 
concurrent-chains experiments with that of 
two previous models: the contextual choice 
model (CCM; Grace, 1994), and the hyperbol- 
ic value-added model (HVA; Mazur, 2001). 
Specific details of CCM and HVA are present- 
ed in the articles cited and we will not repeat 
them here. However, we note that both models 
are based on the generalized matching law 
(Baum, 1974) and take the following form: 


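\[
\frac{B_L}{B_R} = b \left( \frac{R_L}{R_R} \right)^{a} \frac{V_L}{V_R},
\]

or in logarithmic terms,

\[
\log \frac{B_L}{B_R} = \log b + a \log \frac{R_L}{R_R} + \log \frac{V_L}{V_R}, \tag{6}
\]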


where $B_L$ and $B_R$ are initial-link responses, $R_L$ and
$R_R$ are the rates of entering the terminal links,
$V_L$ and $V_R$ are the values of the terminal links,
a is a sensitivity parameter, and b is bias. 
According to Equation 6, initial-link response 
allocation matches the relative frequency of 
conditioned reinforcement (i.e., terminal-link 
entry) provided by the choice alternatives with 
sensitivity a and bias b, and with a concatenat- 
ed term (additive in the logarithmic version) 
that represents the effects of relative terminal- 
link value. For the decision model, the value 
ratio in Equation 6 is replaced with Equation 4 
(response strength ratio). Thus, like CCM and 
HVA, the decision model assumes that effects 
of terminal-link schedules on choice are 
additive with the effects of the relative fre- 
quency of entering the terminal links (cf. 
Fantino & Romanowich, 2007). 
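
To make the additive structure concrete, the following Python sketch computes the predicted log initial-link response ratio for the decision model by substituting the response-strength ratio for the value ratio. The function and argument names are hypothetical, base-10 logarithms are assumed, and the probabilities that delays are judged short are supplied directly as inputs.

```python
import math

R_MAX = 1.0   # maximum response strength (value used in the article)
R_MIN = 0.01  # minimum response strength (value used in the article)


def predicted_log_ratio(log_b, a, entries_left, entries_right, p_left, p_right):
    """Log response ratio: bias + entry-rate term + response-strength term.

    entries_left / entries_right: rates of entering the terminal links.
    p_left / p_right: probabilities that each terminal-link delay is judged short.
    """
    strength_left = p_left * R_MAX + (1.0 - p_left) * R_MIN
    strength_right = p_right * R_MAX + (1.0 - p_right) * R_MIN
    entry_term = a * math.log10(entries_left / entries_right)
    strength_term = math.log10(strength_left / strength_right)
    return log_b + entry_term + strength_term


# Equal entry rates: only bias and the terminal-link term contribute.
print(predicted_log_ratio(0.05, 1.0, 30.0, 30.0, p_left=0.7, p_right=0.3))
```

When the entry rates are equal the sensitivity parameter drops out, which is consistent with fitting only two parameters to data sets with equal terminal-link entry rates.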

The decision model was fitted to the same 
archival data sets analyzed by Grace (1994) 
and Mazur (2001). In addition, CCM and HVA 
were fitted. The archival data comprised
19 concurrent-chains studies published
before 1994 that met the following
criteria: (a) a minimum of four data points for
each subject; (b) time-based terminal-link
schedules, either FI or VI; and (c) equal
terminal-link reinforcer magnitudes. In addition
to these criteria, we also omitted several 
conditions with unequal initial-link schedules 
in which one of the initial links was VI 0 s, 
from Fantino and Davison (1983; 1 of 56 
conditions) and Davison (1983; 5 of 61 
conditions). Because all models made identical
predictions for Squires and Fantino (1971;
unequal initial links, equal terminal links), this
data set was omitted from the analyses. Overall,
a total of 1463 individual-subject data points
from 18 studies were analyzed.¹ For all studies,
response allocation was scaled as the log
initial-link response ratio. Thus, a logarithmic
version of each model was fitted to the data. 

For all models, parameters that maximized the
variance accounted for were estimated using
Microsoft Excel Solver. For the decision
model, two parameters were fitted to all
data sets in which terminal-link entry rates
were equal (log b and $\sigma$), and three to data
sets in which rates were unequal (log b, a, and
$\sigma$). For CCM and HVA, we first used the same
number of parameters as the decision model,
that is, either two or three depending on
whether terminal-link entry rates were equal
(CCM: log b, $a_1$, and $a_2$; HVA: log b, $a_1$, and $a_t$).
However, both models contain an additional
parameter (k), which was used by both Grace
(1994) and Mazur (2001) to provide an
adequate fit to studies with uncued terminal
links. Here we used the following rule: Both
HVA and CCM were initially fitted to the data
without letting the k parameter vary. If the
variance accounted for was less than 80%, then
the model was refitted while allowing k to vary.
If the variance accounted for improved by
more than 5%, then the fit with the k
parameter was used; otherwise the original fit
was retained. Thus in all cases, the decision
model had the same number of parameters as
CCM and HVA, or fewer. 
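
The fitting criterion and the rule for the k parameter can be sketched in Python as follows. Two points are assumptions made here: that variance accounted for is computed as one minus the ratio of residual to total sum of squares, and that "improved by more than 5%" refers to percentage points. The function names are hypothetical.

```python
def variance_accounted_for(observed, predicted):
    """Proportion of variance accounted for: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    ss_resid = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_resid / ss_total


def use_k_parameter(vac_fixed_k, vac_free_k):
    """Decide whether to keep the fit in which k was allowed to vary.

    vac_fixed_k: VAC (as a proportion) with k not allowed to vary.
    vac_free_k:  VAC (as a proportion) after refitting with k free.
    """
    if vac_fixed_k >= 0.80:
        return False  # initial fit adequate; no refit with k is needed
    return (vac_free_k - vac_fixed_k) > 0.05


print(use_k_parameter(0.70, 0.78))  # refit improves VAC by 8 points
print(use_k_parameter(0.85, 0.95))  # initial fit already at or above 80%
```

Under this reading, the extra parameter is retained only when the simpler fit is poor and the improvement is substantial, which keeps the parameter counts of the competing models close to that of the decision model.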

HVA predicts exclusive preference when 
one terminal link or the other does not signal 
an increase in reinforcement value. In this 
case, because exclusive preference is not 
possible to achieve on a logarithmic scale, we 
used a maximum predicted response ratio of 
100:1 (or 1:100). This ensured that HVA would
have the same maximum predicted preference
as the decision model given our choice for $r_{\max}$
and $r_{\min}$. 


¹ The 18 studies included in the archival analysis were:
Alsop & Davison, 1988; Chung & Herrnstein, 1967; Davison,
1976, 1983, 1988; Davison & Temple, 1973; Duncan &
Fantino, 1970; Dunn & Fantino, 1982; Fantino, 1969;
Fantino & Davison, 1983; Fantino & Royalty, 1987; Gentry
& Marr, 1980; Killeen, 1970; MacEwen, 1972; Omino & Ito,
1993; Preston & Fantino, 1991; Wardlaw & Davison, 1974;
Williams & Fantino, 1978. 

Table 1 shows details of the model fits.
Averaged across 18 studies (which included
87 data sets and 1463 data points), the
variance accounted for (VAC) by the decision
model (DM), HVA, and CCM was 88.3%,
84.5%, and 87.6%, respectively. The corresponding
medians were 90.4%, 85.5%, and
88.1%. Across the studies, the minimum and
maximum VAC were: DM, 73% (Gentry &
Marr, 1980) and 99% (Duncan & Fantino, 1970);
HVA, 63% (Gentry & Marr, 1980) and 94%
(Davison, 1976); and CCM, 76% (Fantino &
Royalty, 1987) and 97% (Davison, 1976). Thus
all three models provided a
reasonably accurate description of the data.
Notably, the DM required fewer fitted parameters
(n = 202) than HVA (n = 223)
and CCM (n = 228), even though it accounted
for slightly more variance. 

We conducted a residual analysis to deter- 
mine whether there were systematic deviations 
of the data from predictions of each model 
(Sutton, Grace, McLean & Baum, 2008). 
Figure 1 plots residual scores (obtained minus
predicted) pooled across data sets as a function of 
the predicted values for each model. Because 
the models incorporate bias in structurally the 
same way (i.e., as an additive term), estimates 
of log b were subtracted from the predicted 
values prior to the residual analysis. As Sutton 
et al. noted, removing variance in bias across 
studies should result in a more sensitive test of 
systematic trends in the residuals. 

Figure 1 shows that there appears to be a 
similar systematic trend in the residuals of 
each model: For strongly negative predicted 
values, the residuals tend to be greater than 
zero, then decrease below zero as the predict- 
ed value increases, then increase as the 
predicted values become positive, and then 
finally decrease and become less than zero for 
strongly positive predicted values. This trend 
was confirmed by results of polynomial regres- 
sions. In these analyses, we regressed the 
residuals against the predicted values (bias 
free) and their cube, and tested the signifi- 
cance of the linear and cubic components. 
Note that quadratic components (i.e., the
square of the bias-free predicted values) were
excluded because a quadratic function (U shape or
inverted U shape) is not invariant under
admissible transformations of the response
ratio, in which the choice of left/right or right/left is
arbitrary (see Sutton et al., 2008). 
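
The logic of the residual regression can be illustrated with a small Python sketch that fits linear and cubic terms by least squares. This is not the article's analysis: fitting without an intercept and reporting unstandardized coefficients (rather than the beta coefficients in Table 2) are assumptions made here, and the function name and data are hypothetical.

```python
def fit_linear_cubic(predicted, residuals):
    """Least-squares fit of residual = b1 * x + b3 * x**3 (no intercept)."""
    x1 = list(predicted)
    x3 = [x ** 3 for x in x1]
    # Sums of squares and cross-products for the two-predictor normal equations.
    s11 = sum(u * u for u in x1)
    s13 = sum(u * v for u, v in zip(x1, x3))
    s33 = sum(v * v for v in x3)
    s1y = sum(u * y for u, y in zip(x1, residuals))
    s3y = sum(v * y for v, y in zip(x3, residuals))
    det = s11 * s33 - s13 * s13
    if det == 0:
        raise ValueError("predictors are collinear; coefficients are not unique")
    b1 = (s1y * s33 - s13 * s3y) / det
    b3 = (s11 * s3y - s13 * s1y) / det
    return b1, b3


# Made-up residuals with a positive linear and a negative cubic component.
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0.3 * x - 0.05 * x ** 3 for x in xs]
print(fit_linear_cubic(xs, ys))

# Swapping left and right reverses the sign of every predicted value and
# residual; the odd-powered terms give the same coefficients either way.
print(fit_linear_cubic([-x for x in xs], [-y for y in ys]))
```

Because both terms are odd functions, the fitted coefficients do not depend on which alternative is placed in the numerator of the response ratio, which is the invariance property that a quadratic term would violate.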


Table 2 shows the beta coefficients for the
linear and cubic components, and the $R^2$
value, for the polynomial regressions. Results
showed that, for each model, the cubic
component was significantly negative and
the linear component was significantly positive.
These coefficients confirm that the
pattern described above was statistically significant
for each model. The $R^2$ value was lowest
for the DM and highest for CCM, with HVA in
the middle. However, it is notable that the
pattern was the same in all cases, indicating
that each model failed to account fully for the
data in a similar way. 

Some insight about how the data deviated 
from the models’ predictions is provided by 
Figure 2, which shows the results of the 
polynomial regressions in terms of an ob- 
tained versus predicted scatterplot. If the 
residuals showed no systematic pattern, the 
obtained data would fall exactly on the solid 
major diagonal (i.e., obtained = predicted). 
However, the curved functions are based on 
the polynomial regressions, and show that the 
obtained data deviated from the models’ 
predictions in a similar way. As expected from 
the regression coefficients in Table 2, the
pattern was strongest for HVA
and weakest for the DM, with CCM in the
middle. 

Finally, we conducted an analysis to deter- 
mine whether the models’ sensitivity parame- 
ters were invariant with respect to whether VI 
or FI terminal links were used. Grace (1994) 
reported that for CCM, sensitivity to relative 
terminal-link immediacy was greater for FI 
than VI terminal links. Thus the studies were 
separated into two groups depending on 
whether terminal links were both FI (n = 12)
or both VI (n = 6).² For CCM, sensitivity values
($a_2$) were significantly greater for FI terminal
links (M = 1.79) than VI terminal links (M =
0.81), t(85) = 3.60, p < .001. There was an
opposite trend for HVA: $a_t$ was greater for VI
terminal links (M = 0.86) than FI terminal 


² The studies with FI terminal links were: Chung &
Herrnstein, 1967; Davison, 1976, 1983, 1988; Davison &
Temple, 1973; Duncan & Fantino, 1970; Gentry & Marr,
1980; Killeen, 1970; MacEwen, 1972; Omino & Ito, 1993;
Wardlaw & Davison, 1974; and Williams & Fantino, 1978.
The studies with VI terminal links were: Alsop & Davison,
1988; Dunn & Fantino, 1982; Fantino, 1969; Fantino &
Davison, 1983; Fantino & Royalty, 1987; and Preston &
Fantino, 1991. 




Table 1 

For the archival studies listed, average estimated parameter values, variance accounted for 
(VAC), number of data sets per study, number of data points per study (n), and number of 
parameters fitted per study for the decision model (DM), hyperbolic value-added model (HVA; 
Mazur, 2001), and contextual choice model (CCM; Grace, 1994). 

Decision Model (DM) 

Archival Study              log b       a       σ     VAC  #Data Sets       n #Params
Alsop & Davison 1988         0.11    0.79    0.52    0.88           6     156      18
Chung & Herrnstein 1967      0.18    1.00    0.22    0.88           6      54      12
Davison 1976                 0.24    0.43    0.17    0.97           1      20       3
Davison 1983                 0.02    0.76    0.38    0.82           6     314      18
Davison 1988                 0.04    1.00    0.46    0.92           6     135      12
Davison & Temple 1973        0.00    1.00    0.32    0.90           8     156      16
Duncan & Fantino 1970        0.08    1.00    0.10    0.99           2      12       4
Dunn & Fantino 1982          0.17    1.00    0.17    0.77           4      24       8
Fantino 1969                -0.40    1.00    0.15    0.91           4      16       8
Fantino & Davison 1983      -0.05    0.16    0.24    0.90           6     330      18
Fantino & Royalty 1987       0.14    1.00    0.28    0.77           6      42      12
Gentry & Marr 1980           0.03    1.00    0.56    0.73           4      36       8
Killeen 1970                -0.13    1.00    0.20    0.96           4      16       8
MacEwen 1972                 0.17    1.00    0.19    0.97           4      16       8
Omino & Ito 1993             0.09    1.00    0.31    0.91           6      27      12
Preston & Fantino 1991       0.06    0.62    0.29    0.79           9      65      27
Wardlaw & Davison 1974       0.11    1.00    0.24    0.92           1      20       2
Williams & Fantino 1978      0.27    1.00    0.12    0.91           4      24       8
Average                                             0.883
Total                                                              87    1463     202




Hyperbolic Value-Added Model (HVA) 

Archival Study              log b     a_i     a_t       k     VAC  #Data Sets       n #Params
Alsop & Davison 1988         0.04    0.79    0.62    0.20    0.88           6     156      18
Chung & Herrnstein 1967      0.11    1.00    0.76    0.21    0.89           6      54      13
Davison 1976                 0.26    0.35    1.14    0.20    0.94           1      20       3
Davison 1983                 0.05    0.73    0.66    1.83    0.84           6     314      24
Davison 1988                 0.04    1.00    0.86    0.20    0.87           6     135      12
Davison & Temple 1973       -0.06    1.00    0.33    0.99    0.81           8     156      20
Duncan & Fantino 1970        0.07    1.00    1.09    0.20    0.83           2      12       4
Dunn & Fantino 1982          0.16    1.00    1.00    0.00    0.93           4      24      12
Fantino 1969                -0.47    1.00    1.07    0.20    0.78           4      16       8
Fantino & Davison 1983      -0.11    0.15    0.73    1.83    0.76           6     330      19
Fantino & Royalty 1987       0.13    1.00    1.14    0.17    0.78           6      42      13
Gentry & Marr 1980           0.02    1.00    0.45    0.07    0.63           4      36      12
Killeen 1970                -0.11    1.00    0.79    0.20    0.93           4      16       8
MacEwen 1972                 0.30    1.00    0.88    0.20    0.79           4      16       8
Omino & Ito 1993             0.05    1.00    0.53    0.20    0.91           6      27      12
Preston & Fantino 1991       0.01    0.54    0.75    0.20    0.81           9      65      27
Wardlaw & Davison 1974       0.11    1.00    0.69    0.20    0.91           1      20       2
Williams & Fantino 1978      0.26    1.00    1.33    0.20    0.91           4      24       8
Average                                                     0.845
Total                                                                      87    1463     223




Contextual Choice Model (CCM) 

Archival Study              log b     a_1     a_2       k     VAC  #Data Sets       n #Params
Alsop & Davison 1988         0.19    0.81    0.39    1.00    0.88           6     156      18
Chung & Herrnstein 1967      0.17    1.00    2.77    1.24    0.87           6      54      13
Davison 1976                 0.27    0.67    3.26    1.00    0.97           1      20       3
Davison 1983                -0.02    0.81    0.97    0.47    0.81           6     314      24
Davison 1988                 0.04    1.00    0.42    0.33    0.92           6     135      18
Davison & Temple 1973       -0.05    1.00    1.18    0.62    0.86           8     156      20
Duncan & Fantino 1970        0.07    1.00    4.67    1.00    0.95           2      12       4
Dunn & Fantino 1982          0.16    1.00    1.87    1.00    0.96           4      24       8
Fantino 1969                -0.24    1.00    1.47    1.00    0.92           4      16       8
Fantino & Davison 1983      -0.09    0.16    0.83    0.74    0.84           6     330      21
Fantino & Royalty 1987       0.25    1.00    1.00    0.93    0.76           6      42      13
Gentry & Marr 1980           0.02    1.00    0.90    0.47    0.76           4      36      12
Killeen 1970                -0.15    1.00    2.18    0.75    0.94           4      16       9
MacEwen 1972                 0.45    1.00    1.31    1.00    0.77           4      16       8
Omino & Ito 1993             0.00    1.00    0.93    1.00    0.95           6      27      12
Preston & Fantino 1991       0.35    0.75    0.18    1.00    0.77           9      65      27
Wardlaw & Davison 1974       0.10    1.00    2.00    1.00    0.88           1      20       2
Williams & Fantino 1978      0.26    1.00    5.18    1.00    0.94           4      24       8
Average                                                     0.876
Total                                                                      87    1463     228


links (M = 0.72), but the difference failed to 
reach significance, t(85) = 1.80, p = .07. For 
the decision model, the average sensitivity (σ) 
was nearly equal for FI (M = 0.30) and VI 
terminal links (M = 0.29), t(85) = 0.30, ns. 

This analysis shows that parameter estimates 
which measure sensitivity to terminal-link 
schedules were overall more consistent for 
the decision model than for CCM and HVA. 
For CCM and HVA, sensitivity parameters 
tended to vary depending on the type of 
terminal-link schedule, whereas for the decision 
model they did not. This suggests that the 
decision model performs better than CCM and 
HVA on the criterion of parameter invariance 
(Nevin, 1984). 
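
The invariance comparison can be sketched as an independent-samples t test on per-data-set sensitivity estimates. The sketch assumes a pooled-variance (Student) test, which is consistent with the reported 85 degrees of freedom: by the data-set counts in Table 1, the FI studies contribute 52 data sets and the VI studies 35, and 52 + 35 - 2 = 85. The sensitivity values generated below are synthetic placeholders, not the archival estimates, and `pooled_t_test` is a hypothetical helper name.

```python
import math
import random
from statistics import mean, variance

def pooled_t_test(group_a, group_b):
    """Student's independent-samples t with pooled variance; returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    se = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / se, df

# Synthetic sensitivity estimates (illustration only): 52 FI and 35 VI data sets.
random.seed(1)
fi_values = [random.gauss(1.8, 1.2) for _ in range(52)]
vi_values = [random.gauss(0.8, 1.2) for _ in range(35)]
t_stat, df = pooled_t_test(fi_values, vi_values)
print(f"t({df}) = {t_stat:.2f}")
```

Under this reading, a model satisfies parameter invariance when the test statistic for its sensitivity parameter is near zero, as reported for the decision model.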

DISCUSSION 

The goal of the present study was to determine if the decision model proposed 
for acquisition of choice in concurrent chains by Grace and McLean (2006) and 
Christensen and Grace (2008, 2009a) could produce a viable model for 
steady-state responding, and to compare its accuracy with that of previous 
models (CCM; Grace, 1994; HVA; Mazur, 2001). To accomplish this, we derived an 
expression for asymptotic relative response strength (Equation 4), 
representing the effects of terminal-link schedules on initial-link 
responding, and used it in the generalized-matching-law framework adopted by 
previous models (Equation 6) as a replacement for relative terminal-link 
value. The resulting model accounted for slightly more variance in log 
initial-link responding (88.3%) than CCM and HVA (87.6% and 84.5%, 
respectively) across a range of archival studies while requiring about 10% 
fewer free parameters in total. Moreover, the decision model showed no 
evidence of systematic differences in parameter estimates for studies with VI 
and FI terminal links, unlike CCM and, to a lesser extent, HVA. We therefore 
conclude that the decision model, originally developed to explain individual 
differences in pigeons’ responding under dynamic conditions in which terminal 
links changed unpredictably across sessions (Grace & McLean, 2006), provides 
an account of molar, steady-state choice that is at least as good as, and 
arguably better than, existing models. 
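
The parsimony comparison follows directly from the totals in Table 1. The short calculation below uses only those reported totals and mean VAC values; the variable and function names are illustrative.

```python
# Totals and mean VAC as reported in Table 1.
total_params = {"DM": 202, "HVA": 223, "CCM": 228}
mean_vac = {"DM": 0.883, "HVA": 0.845, "CCM": 0.876}

def relative_saving(reference, comparison):
    """Proportion of free parameters saved by `reference` relative to `comparison`."""
    return 1.0 - total_params[reference] / total_params[comparison]

savings = {model: relative_saving("DM", model) for model in ("HVA", "CCM")}
for model, saving in savings.items():
    vac_gain = mean_vac["DM"] - mean_vac[model]
    print(f"DM vs {model}: {saving:.1%} fewer parameters, VAC difference {vac_gain:+.3f}")
```

The decision model uses roughly 9% fewer parameters than HVA and 11% fewer than CCM, which is the basis for the "about 10%" figure.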

However, the decision model, like CCM and HVA, does not provide a complete 
account of choice. Analysis of residuals found that a similar pattern of 
systematic deviations was present in the predictions of all three models, 
which could be characterized as a third-order polynomial with positive linear 
and negative cubic components. The simplest interpretation of this pattern is 
that over an approximately 4-log10-unit range, log response allocation is a 
nonlinear (sigmoidal) function, increasing more rapidly and then less rapidly 
as preference moves away from indifference, whereas the decision model, HVA, 
and CCM all predict that the rate of increase in log response allocation 
should show less change. This pattern is most clearly apparent in relation to 
CCM, which predicts that when overall initial- and terminal-link durations are 
constant, log response allocation is a linear function of the log 
terminal-link immediacy (i.e., reciprocal of delay) ratio. In contrast to CCM, 
the decision model predicts that log response allocation should be a sigmoidal 
function of the log terminal-link immediacy ratio, with the degree of 
nonlinearity determined by the parameter σ (see Grace & McLean, 2006, 
Figure 6). However, the residual analysis shows that the decision model fails 
to capture the full extent of the nonlinearity in the data. 

[Figure 1: three panels titled "DM - Pooled Data", "HVA - Pooled Data", and 
"CCM - Pooled Data"; horizontal axis: Bias-Free Predicted Response Allocation; 
vertical axis ticks from -2 to 2.] 

Fig. 1. Residual values (obtained minus predicted; n = 1463) plotted as a 
function of bias-free predicted values for the decision model (DM), CCM and 
HVA. 

[Figure 2: horizontal axis: Predicted.] 

Fig. 2. Obtained versus predicted log response allocation scatterplot showing 
fitted cubic polynomials based on the regression analysis for each model’s 
residuals. If the residuals bore no systematic relationship to predicted 
values, the fitted functions would correspond to the major diagonal (dark 
line), representing obtained = predicted. The curvature in the dashed lines 
indicates how the obtained data deviated systematically from the predictions 
of the DM, HVA, and CCM. 

Table 2 

Results of polynomial-regression analysis of residual scores. Shown are the 
beta coefficients for the linear and cubic polynomial components and R², for 
the DM, CCM and HVA models. 

Model     Linear      Cubic       R²
DM        0.08***    -0.04***    .042
HVA       0.19***    -0.06***    .081
CCM       0.14***    -0.05***    .103

***p < .001 

The variance accounted for by CCM and HVA is somewhat lower than that reported 
by Grace (1994) and Mazur (2001), who found that the models accounted for 
approximately 90% of the variance in response allocation. This can be 
attributed to their use of choice proportions rather than log ratios: 
proportions impose ceiling and floor effects and thus limit the deviations of 
obtained from predicted values for relatively extreme preference conditions. 
It is notable that when the analyses in the present article were carried out 
using choice proportions (not reported here), no systematic deviations were 
found in the residuals of any model. This confirms that log ratios provide a 
more sensitive assay of response allocation, and should be used instead of 
choice proportions, particularly when models are fitted. 
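
The ceiling and floor effects of choice proportions can be made concrete with a small numerical sketch. It assumes only the standard conversion between a log10 response ratio and a choice proportion; the size of the misprediction and the two preference levels are arbitrary values chosen for illustration.

```python
def log_ratio_to_proportion(log_ratio):
    """Convert a log10 response ratio (left/right) to a choice proportion for the left key."""
    ratio = 10.0 ** log_ratio
    return ratio / (1.0 + ratio)

deviation = 0.3  # the same misprediction, in log10 units, at both preference levels

def proportion_miss(predicted_log_ratio):
    """Size of that misprediction once both values are expressed as choice proportions."""
    obtained = log_ratio_to_proportion(predicted_log_ratio + deviation)
    return obtained - log_ratio_to_proportion(predicted_log_ratio)

miss_near_indifference = proportion_miss(0.0)
miss_extreme = proportion_miss(1.5)
print(f"near indifference: {miss_near_indifference:.3f}")
print(f"extreme preference: {miss_extreme:.3f}")
```

An identical deviation in log units shrinks by roughly an order of magnitude when expressed as a proportion under extreme preference, so systematic misfit in that region contributes little to residuals computed on proportions.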

Unlike CCM and HVA, the decision model 
was able to account for the more extreme 
preference observed with FI terminal links 
with no systematic change in estimated values 
of the sensitivity parameter. The reason that 
the decision model is able to predict less 
extreme preference with VI terminal links is 
that the use of an exponentially weighted 
moving average (EWMA) of log delays to 
compute the criterion yields a lower value 
when terminal links are VI than FI. Thus 
the decision model predicts 
more extreme preference with FI than VI 
terminal links for the same reason that it 
predicts an effect of overall terminal-link 
duration (Christensen & Grace, 2009a): Use 
of FI terminal links results in a longer criterion 
delay compared to VI terminal links with the 
same average reinforcement delay. 
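
A minimal sketch of why averaging log delays yields a lower value for VI than for FI schedules with the same arithmetic mean. It assumes a simple EWMA update with weight `beta` on the most recent delay, a hypothetical five-interval VI schedule, and a criterion based on terminal-link delays alone; the full model also includes initial-link intervals in the criterion, and its exact update rule and parameter values are not restated in this section.

```python
import math

def ewma_log_delay(delays, beta=0.05):
    """EWMA of log10 delays; beta is the weight given to the most recent delay."""
    average = math.log10(delays[0])
    for delay in delays[1:]:
        average = (1.0 - beta) * average + beta * math.log10(delay)
    return average

# Hypothetical schedules with the same arithmetic mean delay (10 s).
vi_intervals = [2.0, 6.0, 10.0, 14.0, 18.0]
vi_delays = vi_intervals * 200      # the VI intervals experienced repeatedly
fi_delays = [10.0] * 1000           # the FI delay is always 10 s

vi_value = 10.0 ** ewma_log_delay(vi_delays)
fi_value = 10.0 ** ewma_log_delay(fi_delays)
print(f"VI: {vi_value:.2f} s, FI: {fi_value:.2f} s")
```

Because the logarithm is concave, an average of log delays sits below the log of the average delay whenever the delays vary, so the back-transformed value for the VI schedule falls short of the 10 s obtained for the FI schedule.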

The success of the decision model in 
specifying an expression for steady-state re- 
sponding validates the strategy of studying 
acquisition as a means toward explaining 
molar choice. The rapid acquisition design in 
which terminal-link schedules change unpre- 
dictably across sessions according to a PRBS 
(Hunter & Davison, 1985) is ideally suited for 
this purpose, because it yields learning curves 
within individual sessions. Experiments based 
on this design can easily generate sufficient 
data points to distinguish between linear and 
nonlinear response allocation (Grace & 
McLean, 2006; Kyonka & Grace, 2007), which 
is more difficult in steady-state designs because 
many sessions are required to obtain each data 
point. 
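
For readers unfamiliar with pseudorandom binary sequences, the sketch below generates one with a linear-feedback recurrence. The 31-step length and the particular recurrence are assumptions made for illustration; this section does not specify the generator used by Hunter and Davison (1985), and mapping each bit to a session's schedule assignment is likewise only an example.

```python
def prbs(length, seed=(1, 0, 0, 0, 0)):
    """Binary sequence from the recurrence s[n] = s[n-3] XOR s[n-5].

    With a nonzero 5-bit seed this recurrence has the maximal period of 31 steps.
    """
    sequence = list(seed)
    while len(sequence) < length:
        sequence.append(sequence[-3] ^ sequence[-5])
    return sequence[:length]

one_period = prbs(31)
# Example mapping: 1 = left terminal link has the shorter delay in that session.
assignment = ["left shorter" if bit else "right shorter" for bit in one_period]
print(one_period)
```

Over one period the two schedule arrangements occur almost equally often (16 versus 15 sessions), while the order of sessions has no simple repeating pattern within the period.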

It is also important to note that the 
incremental modifications to the decision 
model proposed by Christensen and Grace 
(2008, 2009a, 2009b) and here do not funda- 
mentally change the structure of the decision 
model as initially specified by Grace and 
McLean (2006). For example, Christensen 
and Grace’s (2008) proposal (including the 
initial-link intervals in the criterion) did not 
change predictions for Grace and McLean 
because the prior study did not vary initial-link 
duration. Similarly, calculating the delay to be 
judged short or long as an EWMA of previous 
delays on an alternative, which was necessary 
here to account for the preference between VI 
schedules, does not affect the previous 
application of the decision model to FI schedules. 
Our strategy has been to make the necessary 
changes to the decision model in a step-by-step 
fashion, thus increasing its generality and 
extending it to a broader range of situations. 
The alternative approach of defining a com- 
plete model at the outset would not have 
worked, as it would have been unnecessarily 
complex for the initial applications. Further 
elaboration of the model will be necessary to 
extend its scope further, for example to 
incorporate the effects of reinforcer magni- 
tude and probability. 

At a more theoretical level, the decision 
model provides an alternative to conditioned 
reinforcement as an explanation for initial- 
link responding in concurrent chains. Accord- 
ing to the traditional view shared by models 
such as DRT, CCM and HVA, terminal-link 
stimuli acquire the capacity to reinforce 
responding through a process akin to Pavlov- 
ian conditioning, and consequently response 
allocation during the initial links reflects the 
relative conditioned value of the terminal-link 
stimuli. In contrast, the decision model as- 
sumes that differential initial-link responding 
results from the cumulative effect of making 
discriminations about terminal-link delays. 
According to the decision model, what is 
learned and expressed as response allocation 
in concurrent chains is the relative propensity 
to respond in the presence of the initial-link 
stimuli. Regarding the terminal links, the 
decision model assumes that subjects learn 
the reinforcer delays signalled by the stimuli 
(represented by the EWMA), and so those 
stimuli can provide discriminative control for 
terminal-link responding. Thus, the decision 
model is able to accommodate the results of 
experiments which have examined temporal 
control of terminal-link responding (e.g., 
Grace & Nevin, 1999; Kyonka & Grace, 
2007), which are problematic for accounts 
based on conditioned reinforcement. The 
dissociation between choice and timing 
reported by Grace and Nevin (1999) occurs because 
the determinants of responding in the initial 
and terminal links are different. Initial-link 
responding is updated through a retrospective 
process (i.e., decisions about recent 
terminal-link delays) and requires the initial-link stimuli 
to be present for the effects of those decisions, 
in terms of changes in response strength, to be 
made. 

Effects of temporal context on choice — that 
is, overall initial- and terminal-link duration — 
are among the most important results in 
concurrent chains. Previous models have 
explained these effects in terms of how 
conditioned reinforcement depends on tem- 
poral context (Fantino, 1969; Mazur, 2001), or 
temporal context modulates the sensitivity of 
choice to terminal-link value (Grace, 1994). 
The decision model is different from these 
accounts because it assumes that temporal 
context effects result essentially from a confu- 
sion of initial- and terminal-link stimuli: 
Whereas optimal decisions regarding which 
terminal link had the shorter delay would 
require comparison with a criterion that 
depended solely on terminal-link delays, ac- 
cording to the decision model the intervals 
between initial-link onset and terminal-link 
entry also contribute to the criterion, and 
explain why temporal context effects occur. 
This suggests a testable prediction of the 
model: Assuming that making initial-link 
stimuli more discriminable from terminal-link 
stimuli means that they are less likely to 
contribute to the criterion, attenuated effects 
of temporal context should be obtained when 
initial- and terminal-link stimuli are more 
discriminable. For example, an experiment 
might compare the magnitude of the terminal- 
or initial-link effect in two conditions that 
differed in terms of whether the initial- and 
terminal-link stimuli differed in both color 
and position (e.g., white side keys for the 
initial links; red or green center keys for the 
terminal links) or in just whether the alterna- 
tive key was illuminated (e.g., initial links 
signalled by left red and right green keys; 
terminal links signalled by extinguishing the 
alternative initial link). Stronger effects of 
temporal context should be obtained in the 
latter condition, where the initial- and termi- 
nal-link stimuli are more similar. 

Some evidence from previous studies suggests 
this prediction may be valid. Grace (1994) 
found that the parameter k, which scales the 
effect of temporal context, was only necessary to 
fit for studies in which the terminal links were 
uncued; otherwise k was equal to 1. Typically in 
these studies the terminal links were signalled 
by blackout (e.g., Chung & Herrnstein, 1967; 
Gentry & Marr, 1980), and Grace found that for 
these studies, a better fit was obtained with k < 
1, indicating a weaker effect of overall terminal- 
link duration. The distinctiveness of the initial- 
and terminal-link situations is arguably greater 
when the initial links are signalled by keylights 
with a houselight providing general illumina- 
tion and the terminal links are signalled by 
blackout (as in the uncued studies cited above), 
than when the houselight is always illuminated 
and the only difference between the initial and 
terminal links is which keys are lighted and 
their color. The confusability of the initial- and 
terminal-link situations should be less in the 
former case, and if this reduces the contribu- 
tion of the initial-link delays to the criterion the 
effect of overall terminal-link duration would 
be reduced, consistent with the data. 

The decision model assumes that delays are scaled logarithmically. The reason 
for this assumption is simplicity: By using log delays, the model is able to 
use a single parameter (σ) which determines the accuracy with which delays are 
judged short or long relative to the criterion. This entails that the relative 
discriminability of a pair of terminal-link delays depends on their ratio and 
not their absolute values, consistent with Weber’s Law. The model is able to 
predict the well-known deviations from Weber’s Law in concurrent chains (the 
initial- and terminal-link effects) because the initial-link delays are 
included in the computation of the criterion. However, it should be noted that 
a model which assumed a linear scaling of delays could make equivalent 
predictions, provided that the standard deviation increased proportionally 
with the criterion (C). For such a model, the coefficient of variation (σ/C) 
would be the fundamental sensitivity parameter (as in Gibbon, 1977), 
comparable to σ in the current model. 
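
The ratio dependence described above can be checked numerically. The sketch assumes that the probability of judging a delay "short" is a cumulative normal function of the distance between the criterion and the delay; this functional form, and the names `p_short_log` and `p_short_linear`, are assumptions for illustration rather than a restatement of the model's equations.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_short_log(delay, criterion, sigma):
    """Log scaling: distance is measured in log10 units, with a constant SD (sigma)."""
    return normal_cdf((math.log10(criterion) - math.log10(delay)) / sigma)

def p_short_linear(delay, criterion, cv):
    """Linear scaling with SD proportional to the criterion (SD = cv * criterion)."""
    return normal_cdf((criterion - delay) / (cv * criterion))

# Multiplying the delay and the criterion by the same factor leaves both unchanged.
for scale in (1.0, 10.0):
    print(scale,
          round(p_short_log(5.0 * scale, 10.0 * scale, 0.3), 4),
          round(p_short_linear(5.0 * scale, 10.0 * scale, 0.3), 4))
```

Both versions depend only on the ratio of the delay to the criterion, which is the sense in which a linear model with a constant coefficient of variation could serve in place of log scaling with a constant σ.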

One of the most well-known results in the concurrent-chains literature is 
preference for variability, that is, for a VI schedule over an FI schedule 
that provides the same reinforcement rate (Herrnstein, 1964). Although the 
decision model predicts preference for VI over FI schedules with the same 
arithmetic mean delay, because of the use of log scaling and the EWMA to 
update the terminal-link delay it predicts that the VI-FI equivalence value, 
that is, the FI schedule that should be equally preferred to a VI schedule, 
should occur at the geometric mean of the intervals comprising the VI. By 
contrast, most research suggests that the VI-FI equivalence value occurs at 
the harmonic mean of the VI intervals (Killeen, 1968; Mazur, 1984). A task for 
the future is to determine whether the decision model is able to provide an 
adequate account of results of studies on preference for variability, and 
whether it is able to predict VI-FI equivalence at the harmonic mean. 
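
The gap between the two candidate equivalence values can be illustrated for a hypothetical VI schedule; the interval values below are arbitrary and are not taken from any of the cited studies.

```python
from statistics import geometric_mean, harmonic_mean, mean

# Hypothetical VI schedule: five equally likely intervals, in seconds.
vi_intervals = [2.0, 6.0, 10.0, 14.0, 18.0]

arithmetic = mean(vi_intervals)
geometric = geometric_mean(vi_intervals)   # equivalence value implied by averaging log delays
harmonic = harmonic_mean(vi_intervals)     # equivalence value suggested by most prior research

print(f"arithmetic = {arithmetic:.2f} s")
print(f"geometric  = {geometric:.2f} s")
print(f"harmonic   = {harmonic:.2f} s")
```

For any schedule with variable intervals the harmonic mean is the smallest of the three, so an account based on the geometric mean predicts weaker preference for variability than one based on the harmonic mean.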

It is important to note that although the 
decision model describes the average course of 
acquisition for a pair of terminal-link sched- 
ules, this does not necessarily correspond to 
the actual change in response allocation that 
might be observed in any particular session. 
Like all linear-operator models, the decision 
model predicts a steady approach towards 
asymptote. But data from individual sessions 
rarely show smooth acquisition curves. For 
example, Grace and McLean (2006) examined 
data at the level of session twelfths, and found 
that trajectories within single sessions were 
highly variable (see their Figure 10). Moreover, 
there is substantial evidence that abrupt 
switches in response allocation (i.e., from 
favoring one alternative to the other) occur 
within sessions when schedules are changed 
frequently (Gallistel, Mark, King, & Latham, 
2001; Kyonka & Grace, 2007, 2008). Because 
the decision model computes the probability 
of a “short” decision, not the actual decision 
that is made, it is limited to describing the 
average course of acquisition. Whether a 
modified version of the model can be applied 
to single sessions (perhaps through simulating 
real-time decisions) is a task for future 
research. 
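
The contrast between an average acquisition curve and a single simulated session can be sketched with a generic linear-operator rule. The update rule, the learning-rate parameter `delta`, the number of trials, and the fixed probability of a "short" decision are all hypothetical simplifications chosen for illustration; they are not the decision model's actual equations.

```python
import random

def expected_curve(p_short, delta, trials, start=0.5):
    """Average course: strength moves a fraction delta toward p_short on every trial."""
    strength, curve = start, []
    for _ in range(trials):
        strength += delta * (p_short - strength)
        curve.append(strength)
    return curve

def simulated_curve(p_short, delta, trials, rng, start=0.5):
    """Single-session sketch: each trial samples an actual 'short' or 'long' decision."""
    strength, curve = start, []
    for _ in range(trials):
        if rng.random() < p_short:
            strength += delta * (1.0 - strength)  # 'short' decision: strengthen
        else:
            strength -= delta * strength          # 'long' decision: weaken
        curve.append(strength)
    return curve

average = expected_curve(p_short=0.8, delta=0.1, trials=72)
session = simulated_curve(p_short=0.8, delta=0.1, trials=72, rng=random.Random(0))
print(f"average curve ends at {average[-1]:.3f}; simulated session ends at {session[-1]:.3f}")
```

Taking the expectation of the sampled rule recovers the smooth rule exactly, since p(1 - s) - (1 - p)s = p - s; individual simulated sessions nevertheless fluctuate around that curve, which is one way the variability seen in single-session data might be approached.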
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